Goto

Collaborating Authors

 different task



Integrated perception with recurrent multi-task neural networks

Neural Information Processing Systems

Modern discriminative predictors have been shown to match natural intelligences inspecific perceptual tasks in image classification, object and part detection, boundary extraction, etc. However, a major advantage that natural intelligences still have is that they work well for all perceptual problems together, solving them efficiently and coherently in an integrated manner. In order to capture some of these advantages in machine perception, we ask two questions: whether deep neural networks can learn universal image representations, useful not only for a single task but for all of them, and how the solutions to the different tasks can be integrated in this framework. We answer by proposing a new architecture, which we call multinet, in which not only deep image features are shared between tasks, but where tasks can interact in a recurrent manner by encoding the results of their analysis in a common shared representation of the data. In this manner, we show that the performance of individual tasks in standard benchmarks can be improved first by sharing features between them and then, more significantly, by integrating their solutions in the common representation.


Replay-and-Forget-Free Graph Class-Incremental Learning: A Task Profiling and Prompting Approach

Neural Information Processing Systems

Class-incremental learning (CIL) aims to continually learn a sequence of tasks, with each task consisting of a set of unique classes. Graph CIL (GCIL) follows the same setting but needs to deal with graph tasks (e.g., node classification in a graph). The key characteristic of CIL lies in the absence of task identifiers (IDs) during inference, which causes a significant challenge in separating classes from different tasks (i.e., inter-task class separation). Being able to accurately predict the task IDs can help address this issue, but it is a challenging problem. In this paper, we show theoretically that accurate task ID prediction on graph data can be achieved by a Laplacian smoothing-based graph task profiling approach, in which each graph task is modeled by a task prototype based on Laplacian smoothing over the graph. It guarantees that the task prototypes of the same graph task are nearly the same with a large smoothing step, while those of different tasks are distinct due to differences in graph structure and node attributes. Further, to avoid the catastrophic forgetting of the knowledge learned in previous graph tasks, we propose a novel graph prompting approach for GCIL which learns a small discriminative graph prompt for each task, essentially resulting in a separate classification model for each task. The prompt learning requires the training of a single graph neural network (GNN) only once on the first task, and no data replay is required thereafter, thereby obtaining a GCIL model being both replay-free and forget-free. Extensive experiments on four GCIL benchmarks show that i) our task prototype-based method can achieve 100% task ID prediction accuracy on all four datasets, ii) our GCIL model significantly outperforms state-of-the-art competing methods by at least 18% in average CIL accuracy, and iii) our model is fully free of forgetting on the four datasets.


Multi-task Graph Neural Architecture Search with Task-aware Collaboration and Curriculum

Neural Information Processing Systems

Graph neural architecture search (GraphNAS) has shown great potential for automatically designing graph neural architectures for graph related tasks. However, multi-task GraphNAS capable of handling multiple tasks simultaneously has been largely unexplored in literature, posing great challenges to capture the complex relations and influences among different tasks. To tackle this problem, we propose a novel multi-task graph neural architecture search with task-aware collaboration and curriculum (MTGC3), which is able to simultaneously discover optimal architectures for different tasks and learn the collaborative relationships among different tasks in a joint manner. Specifically, we design the layer-wise disentangled supernet capable of managing multiple architectures in a unified framework, which combines with our proposed soft task-collaborative module to learn the transferability relationships between tasks. We further develop the task-wise curriculum training strategy to improve the architecture search procedure via reweighing the influence of different tasks based on task difficulties. Extensive experiments show that our proposed MTGC3 model achieves state-of-the-art performance against several baselines in multi-task scenarios, demonstrating its ability to discover effective architectures and capture the collaborative relationships for multiple tasks.




Supplementary Document

Neural Information Processing Systems

The pseudo-code of plugging our method into the vanilla BO is summarised in Algorithm 1. Therefore, our method is applicable to any other variants of BO in a plug-in manner. In this section, we present the proofs associated with the theoretical assertions from Section 2. To Lemma 1. Assume the GP employs a stationary kernel Lemma 2. Given Lemma 1, determining Proposition 2. Leveraging Lemma 2, suppose Lemma 3. As per Srinivas et al., the optimization process in BO can be conceptualized as a sampling Pr null |f ( x) µ(x) | ωσ ( x) null > δ, (24) where δ > 0 signifies the confidence level adhered to by the UCB. This lemma is directly from Srinivas et al. . The proof can be found therein. Theorem 1. Leveraging Corollary 1, when employing the termination method proposed in this paper, As discussed in Remark 2 of Section 2.2 in the main manuscript, we suggest initializing L-BFGS Different subplots are (a) our proposed method, (b) Naïve method, (c) Nguyen's method, (d) Lorenz's Different subplots are (a) our proposed method, (b) Naïve method, (c) Nguyen's method, (d) Lorenz's Different subplots are (a) our proposed method, (b) Naïve method, (c) Nguyen's method, (d) Lorenz's Different subplots are (a) our proposed method, (b) Naïve method, (c) Nguyen's method, (d) Lorenz's